-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Table features interfaces/utilities for Kernel #4
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Super detailed and thorough. Thanks for this! Left some comments.
true); | ||
|
||
/** Whether commands modifying this Delta table are allowed to create new deletion vectors. */ | ||
public static final TableConfig<Boolean> ENABLE_DELETION_VECTORS_CREATION = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would it be nice if all of our table-feature-enablement-related TableConfigs ended with _ENABLED
?
e.g. APPEND_ONLY_ENABLED
, CHANGE_DATA_FEED_ENABLED
, DELETION_VECTORS_CREATION_ENBABLED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
* | ||
* <p>Separately, a feature can be automatically supported by a table's metadata when certain | ||
* feature-specific table properties are set. For example, `changeDataFeed` is automatically | ||
* supported when there's a table property `delta.enableChangeDataFeed=true`. See {@link |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this mean that in the metadata / protocol that is written to the lob, delta.enableChangeDataFeed=true
is in the metadata (table property) but changeDataFeed
is not in the protocol's reader/writer features?
so do we just, implicitly, when we create the Protocol
action, we add that changeDataFeed
reader/write feature?
and then do we write out the new-and-improved-and-corrected Protocol next time we write?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this mean that in the metadata / protocol that is written to the lob, delta.enableChangeDataFeed=true is in the metadata (table property) but changeDataFeed is not in the protocol's reader/writer features?
Thats is correct.
so do we just, implicitly, when we create the Protocol action, we add that changeDataFeed reader/write feature?
Yep, if all the metadata requirements are satisfied.
and then do we write out the new-and-improved-and-corrected Protocol next time we write?
Yes
* @param featureName a globally-unique string indicator to represent the feature. All characters | ||
* must be letters (a-z, A-Z), digits (0-9), '-', or '_'. Words must be in camelCase. | ||
* @param minReaderVersion the minimum reader version this feature requires. For a feature that | ||
* can only be explicitly supported, this is either `0` or `3` (the reader protocol version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also remind us what 0
means -->
0
(this is a writer feature, not a reader feature) or 3
(the reader protocol ....)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calrified.
checkArgument( | ||
featureName.chars().allMatch(c -> Character.isLetterOrDigit(c) || c == '-' || c == '_'), | ||
"name contains invalid characters: " + featureName); | ||
this.minReaderVersion = minReaderVersion; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checkArgument >= 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
featureName.chars().allMatch(c -> Character.isLetterOrDigit(c) || c == '-' || c == '_'), | ||
"name contains invalid characters: " + featureName); | ||
this.minReaderVersion = minReaderVersion; | ||
this.minWriterVersion = minWriterVersion; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whats the min? checkArgument >= 2? or whaterver it is supposed to be
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible to have writer version 1.
} | ||
} | ||
|
||
public static final TableFeature VARIANT_FEATURE = new VariantTypeTableFeature("variantType"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are your thoughts on VARIANT_RW_FEATURE
?
and then DOMAIN_METADATA_W_FEATURE
?
i.e. include the readerWriter or writer-ness in the variable name itself? this could help throughout the code to re-enforce understanding of the features?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no prefernece. Added the prefixes.
// can be explicitly added to the protocol's `readerFeatures` | ||
supportsReaderFeatures(); | ||
|
||
boolean shouldAddToWriterFeatures = supportsWriterFeatures(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we need to check if the feature is a writer feature?
* Determine whether this protocol can be safely upgraded to a new protocol `to`. This means: - | ||
* all features supported by this protocol are supported by `to`. | ||
* | ||
* <p>Examples regarding feature status: - from `[appendOnly]` to `[appendOnly]` => allowed. - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the -
bullet points should be on newlines
*/ | ||
public Protocol normalized() { | ||
// Normalization can only be applied to table feature protocols. | ||
if (!supportsWriterFeatures()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about readerFeatures? is there a relationship between them? i.e. does supportsReaderFeatures imply supportsWriterFeatures?
* <p>For example: (1, 7, AppendOnly, Invariants, CheckConstraints) -> (1, 3) (3, 7, RowTracking) | ||
* -> (1, 7, RowTracking) | ||
*/ | ||
public Protocol normalized() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is normalize the right/best word here? wasn't obvious to me. what about minimized
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and then maximized
for denormalized
} | ||
|
||
/** A base class for all table legacy reader-writer features. */ | ||
public abstract static class LegacyReaderWriterFeature extends TableFeature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this implement ReaderWriterFeatureType
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it must? Otherwise isReaderWriterFeature
is wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, added it.
public boolean isReaderWriterFeature() { | ||
return this instanceof TableFeatures.ReaderWriterFeatureType; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a silly question, but do we need this interface at all? Can we just infer this from minReaderVersion
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in your doc you mentioned this might do a check and require that minReaderVersion
is overriden to be non-zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I guess I see it is used below in validate
. Still wondering what value the interface uses though and if we would just always use this method rather than checking instanceof
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a descriptive way. When you see a table feature and what is inheirting, we know it is a readerwriter or writer only feature.
we could depend on minReaderVersion = 0, but that's just hacky.
Which Delta project/connector is this regarding?
Description
How was this patch tested?
Does this PR introduce any user-facing changes?